SemanticScuttle - klotz.me » klotz: machine learning+clustering+k-means+k-means

OpenAI Embeddings and Clustering for Survey Analysis — A How-To Guide

A guide to using OpenAI embeddings and clustering techniques to analyze survey data and extract topics and actionable insights from the responses.

The process involves transforming textual survey responses into embeddings, grouping similar responses through clustering, and then identifying key themes or topics to aid in business improvement.

2024-10-26 Tags: embedding, clustering, survey analysis, data science, visualization, k-means, tsne by klotz
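
As a rough illustration of the embed, cluster, then label-themes flow described in the entry above, here is a minimal sketch. It is not the guide's code: `embed` is a hypothetical keyword-count stand-in for a real embeddings API call, `VOCAB` and the sample responses are invented for the example, and the k-means routine uses a simple deterministic farthest-point seeding rather than any particular library's initialisation.

```python
import math

# Hypothetical keyword vocabulary; a real pipeline would use API embeddings instead.
VOCAB = ["shipping", "slow", "late", "price", "expensive", "cost"]

def embed(text):
    """Hypothetical stand-in for an embeddings call: keyword counts per response."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def kmeans(points, k, max_iters=100):
    """Lloyd's k-means with deterministic farthest-point seeding."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing seed.
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

def representatives(responses, vectors, centroids, labels):
    """Pick the response closest to each centroid as a candidate theme label."""
    reps = {}
    for c, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            best = min(members, key=lambda i: math.dist(vectors[i], centroid))
            reps[c] = responses[best]
    return reps

responses = [
    "shipping was slow",
    "slow shipping and late",
    "late shipping",
    "price is too expensive",
    "expensive price and high cost",
    "the cost and price went up",
]
vectors = [embed(r) for r in responses]
centroids, labels = kmeans(vectors, k=2)
themes = representatives(responses, vectors, centroids, labels)
for cluster_id, example in themes.items():
    print(cluster_id, example)
```

With real embeddings only `embed` would change; the clustering and the "closest response to the centroid" heuristic for naming a theme stay the same.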

A Guide to Clustering Algorithms

An overview of clustering algorithms, including centroid-based (K-Means, K-Means++), density-based (DBSCAN), hierarchical, and distribution-based clustering. The article explains how each type works, its pros and cons, provides code examples, and discusses use cases.

2024-09-06 Tags: clustering, unsupervised learning, machine learning, data science, python, k-means, k-means++, dbscan, hierarchical clustering, distribution based clustering by klotz
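
To make the density-based family mentioned in the entry above concrete, the following is a small, unoptimised DBSCAN sketch in plain Python. It is not taken from the article; the parameter names `eps` and `min_samples` and the toy points are assumptions chosen for illustration. Unlike a centroid-based method, it needs no cluster count and can mark isolated points as noise.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already visited
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1  # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # previously noise, now a border point
                continue
            if labels[j] is not None:
                continue  # already claimed by a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels

# Two dense squares plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_samples=3)
print(labels)
```

The neighbour search here is quadratic in the number of points; production implementations typically use a spatial index instead.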

Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach

This article discusses a method for automatically curating high-quality datasets for self-supervised pre-training of machine learning systems. The method applies k-means successively and hierarchically to a large and diverse data repository to obtain clusters that are distributed uniformly among data concepts, then performs a hierarchical, balanced sampling step over these clusters. Experiments on three different data domains show that features trained on the automatically curated datasets outperform those trained on uncurated data, while being on par with or better than those trained on manually curated data.

2024-06-01 Tags: self-supervised learning, clustering, machine learning, k-means, feature training, llm by klotz
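
The idea in the entry above (k-means applied in levels, then balanced sampling down the resulting hierarchy) can be sketched in a heavily simplified two-level form. This is an illustration under stated assumptions, not the paper's procedure: the function name `balanced_sample`, the parameters `k_fine`, `k_coarse`, and `budget`, the even budget split, and the toy data are all hypothetical, and the k-means routine uses deterministic farthest-point seeding.

```python
import math
import random
from collections import defaultdict

def kmeans(points, k, max_iters=100):
    """Lloyd's k-means with deterministic farthest-point seeding."""
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        centroids.append(max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

def balanced_sample(points, k_fine, k_coarse, budget, seed=0):
    """Cluster points, cluster the centroids, then split the budget evenly per level."""
    rng = random.Random(seed)
    fine_centroids, fine_labels = kmeans(points, k_fine)
    _, coarse_labels = kmeans(fine_centroids, k_coarse)

    # Build the hierarchy: coarse cluster -> fine cluster -> point indices.
    tree = defaultdict(lambda: defaultdict(list))
    for idx, fine in enumerate(fine_labels):
        tree[coarse_labels[fine]][fine].append(idx)

    chosen = []
    per_coarse = budget // len(tree)
    for fine_groups in tree.values():
        per_fine = per_coarse // len(fine_groups)
        for members in fine_groups.values():
            # Small clusters contribute everything they have.
            chosen.extend(rng.sample(members, min(per_fine, len(members))))
    return sorted(chosen)

# Imbalanced toy data: two large nearby groups and one small distant group.
group_a = [(i * 0.1, 0.0) for i in range(20)]
group_b = [(10 + i * 0.1, 0.0) for i in range(20)]
group_c = [(100 + i * 0.1, 100.0) for i in range(4)]
points = group_a + group_b + group_c

chosen = balanced_sample(points, k_fine=3, k_coarse=2, budget=8)
print(chosen)
```

Uniform sampling from this data would pick the rare group only about one time in eleven; splitting the budget down the hierarchy gives the rare group as much weight as the two common groups combined.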

Cluster-then-predict for classification tasks - Towards Data Science

2020-02-11 Tags: clustering, prediction, k-means, feature engineering, machine learning by klotz

K-Means & Other Clustering Algorithms: A Quick Intro with Python – LearnDataSci